[Deep Learning] Image Recognition in Practice: 102-Category Flower Classification (Flower 102) Case Study
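Contents
Convolutional networks in practice: classifying flowers
    Data preprocessing
    Network module setup
    Saving and testing the model
    Data download
1. Import packages
2. Data preprocessing and operations
3. Build the data source
    Read the actual names behind the labels
4. Display some data
5. Load a model provided by models and initialize it with pretrained weights
6. Initialize the model architecture
7. Set the parameters to train
7. Training and prediction
    7.1 Optimizer settings
    7.2 Start training the model
    7.3 Train all layers
        Start training
8. Load the trained model
9. Inference
    9.1 Compute the most probable class
    9.2 Show the prediction results
Closing words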
Convolutional networks in practice: classifying flowers
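This article works through a classification task on the Oxford flower dataset. It builds a general-purpose network pipeline (implemented mainly with ResNet) and combines common PyTorch operations: preprocessing, training, saving a model, and loading a model. The data folder contains 102 kinds of flowers, and the goal is to classify them.

Folder structure

    flower_data
        train
            1 (category)
            2
                xxx.png / xxx.jpg
        valid

The work is split into the following modules.

Data preprocessing
    Data augmentation
    Data preprocessing

Network module setup
    Load a pretrained model, calling the classic network architectures in torchvision directly
    The original training task may have been a 1000-class problem (with different classes), so the output layer has to be changed to fit our own task

Saving and testing the model
    What gets saved with the model can be chosen selectively

Data download: https://www.kaggle.com/datasets/nunenuh/pytorch-challange-flower-dataset
Rename the downloaded folder and put it in the same root directory as the code.

1. Import packages

    # imports used by the code in this article
    import os
    import json
    import time
    import copy

    import numpy as np
    import matplotlib.pyplot as plt
    import torch
    from torch import nn, optim
    from torchvision import transforms, models, datasets
    from PIL import Image

2. Data preprocessing and operations

Below is my data root directory:

    data_dir = './flower_data'  # matches the root location printed in section 3

Note: the article "Combinations and differences of dots and slashes in Python paths" explains the dot and slash notation.

3. Build the data source

data_transforms specifies all image preprocessing operations.
ImageFolder assumes the files are organized in folders, with each folder holding the images of one class.

    data_transforms = {
        # two parts: one for training, one for validation
        'train': transforms.Compose([
            transforms.RandomRotation(45),           # random rotation between -45 and 45 degrees
            transforms.CenterCrop(224),              # crop 224 x 224 from the center
            transforms.RandomHorizontalFlip(p=0.5),  # flip horizontally with probability 0.5
            transforms.RandomVerticalFlip(p=0.5),    # flip vertically with probability 0.5
            # arguments: brightness, contrast, saturation, hue
            transforms.ColorJitter(brightness=0.2, contrast=0.1, saturation=0.1, hue=0.1),
            # convert to grayscale with probability 0.025; the result still has
            # three channels, with R, G and B all equal
            transforms.RandomGrayscale(p=0.025),
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),  # mean, std
        ]),
        # resize the shorter side to 256, take the central 224 x 224 crop,
        # convert to a tensor, then normalize
        'valid': transforms.Compose([
            transforms.Resize(256),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),  # same mean and std as training
        ]),
    }

    batch_size = 8

    image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x])
                      for x in ['train', 'valid']}
    dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=batch_size, shuffle=True)
                   for x in ['train', 'valid']}
    dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'valid']}
    class_names = image_datasets['train'].classes

    # inspect the datasets
    image_datasets

    {'train': Dataset ImageFolder
         Number of datapoints: 6552
         Root location: ./flower_data/train
         StandardTransform
     Transform: Compose(
                    RandomRotation(degrees=[-45.0, 45.0], interpolation=nearest, expand=False, fill=0)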
                    CenterCrop(size=(224, 224))
                    RandomHorizontalFlip(p=0.5)
                    RandomVerticalFlip(p=0.5)
                    ColorJitter(brightness=[0.8, 1.2], contrast=[0.9, 1.1], saturation=[0.9, 1.1], hue=[-0.1, 0.1])
                    RandomGrayscale(p=0.025)
                    ToTensor()
                    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
                ),
     'valid': Dataset ImageFolder
         Number of datapoints: 818
         Root location: ./flower_data/valid
         StandardTransform
     Transform: Compose(
                    Resize(size=256, interpolation=bilinear, max_size=None, antialias=None)
                    CenterCrop(size=(224, 224))
                    ToTensor()
                    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
                )}

    # verify that the data has been processed
    # (dataloaders is a dict holding one DataLoader for 'train' and one for 'valid')
    dataloaders

    dataset_sizes

    {'train': 6552, 'valid': 818}

Read the actual names behind the labels

Use the JSON file in the data directory to map each category label back to the name of the flower.

    with open('./flower_data/cat_to_name.json', 'r') as f:
        cat_to_name = json.load(f)

    cat_to_name

    {'21': 'fire lily', '3': 'canterbury bells', '45': 'bolero deep blue', '1': 'pink primrose',
     '34': 'mexican aster', '27': 'prince of wales feathers', '7': 'moon orchid', '16': 'globe-flower',
     '25': 'grape hyacinth', '26': 'corn poppy', '79': 'toad lily', '39': 'siam tulip',
     '24': 'red ginger', '67': 'spring crocus', '35': 'alpine sea holly', '32': 'garden phlox',
     '10': 'globe thistle', '6': 'tiger lily', '93': 'ball moss', '33': 'love in the mist',
     '9': 'monkshood', '102': 'blackberry lily', '14': 'spear thistle', '19': 'balloon flower',
     '100': 'blanket flower', '13': 'king protea', '49': 'oxeye daisy', '15': 'yellow iris',
     '61': 'cautleya spicata', '31': 'carnation', '64': 'silverbush', '68': 'bearded iris',
     '63': 'black-eyed susan', '69': 'windflower', '62': 'japanese anemone', '20': 'giant white arum lily',
     '38': 'great masterwort', '4': 'sweet pea', '86': 'tree mallow', '101': 'trumpet creeper',
     '42': 'daffodil', '22': 'pincushion flower', '2': 'hard-leaved pocket orchid', '54': 'sunflower',
     '66': 'osteospermum', '70': 'tree poppy', '85': 'desert-rose', '99': 'bromelia',
     '87': 'magnolia', '5': 'english marigold', '92': 'bee balm', '28': 'stemless gentian',
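     '97': 'mallow', '57': 'gaura', '40': 'lenten rose', '47': 'marigold',
     '59': 'orange dahlia', '48': 'buttercup', '55': 'pelargonium', '36': 'ruby-lipped cattleya',
     '91': 'hippeastrum', '29': 'artichoke', '71': 'gazania', '90': 'canna lily',
     '18': 'peruvian lily', '98': 'mexican petunia', '8': 'bird of paradise', '30': 'sweet william',
     '17': 'purple coneflower', '52': 'wild pansy', '84': 'columbine', '12': "colt's foot",
     '11': 'snapdragon', '96': 'camellia', '23': 'fritillary', '50': 'common dandelion',
     '44': 'poinsettia', '53': 'primula', '72': 'azalea', '65': 'californian poppy',
     '80': 'anthurium', '76': 'morning glory', '37': 'cape flower', '56': 'bishop of llandaff',
     '60': 'pink-yellow dahlia', '82': 'clematis', '58': 'geranium', '75': 'thorn apple',
     '41': 'barbeton daisy', '95': 'bougainvillea', '43': 'sword lily', '83': 'hibiscus',
     '78': 'lotus lotus', '88': 'cyclamen', '94': 'foxglove', '81': 'frangipani',
     '74': 'rose', '89': 'watercress', '73': 'water lily', '46': 'wallflower',
     '77': 'passion flower', '51': 'petunia'}

4. Display some data

    def im_convert(tensor):
        """Convert a normalized image tensor into an array that can be displayed."""
        image = tensor.to("cpu").clone().detach()
        # squeeze drops any size-1 dimensions
        image = image.numpy().squeeze()
        # the tensor is laid out as (c, h, w); transpose back to (h, w, c) for plotting
        image = image.transpose(1, 2, 0)
        # undo the normalization
        image = image * np.array((0.229, 0.224, 0.225)) + np.array((0.485, 0.456, 0.406))
        # clamp values below 0 to 0 and values above 1 to 1
        image = image.clip(0, 1)
        return image

    # plot with the function defined above
    fig = plt.figure(figsize=(20, 12))
    columns = 4
    rows = 2

    # take one batch from the iterator for display
    dataiter = iter(dataloaders['valid'])
    inputs, classes = next(dataiter)

    for idx in range(columns * rows):
        ax = fig.add_subplot(rows, columns, idx + 1, xticks=[], yticks=[])
        # use the JSON mapping to print the flower name above each image
        ax.set_title(cat_to_name[str(int(class_names[classes[idx]]))])
        plt.imshow(im_convert(inputs[idx]))
    plt.show()

5. Load a model provided by models and initialize it with pretrained weights

    # settings used by the following sections (values inferred from the outputs shown later)
    model_name = 'resnet'     # one of: resnet, alexnet, vgg, squeezenet, densenet, inception
    feature_extract = True    # True: freeze the feature layers and train only the new output layer

    train_on_gpu = torch.cuda.is_available()
    device = torch.device("cuda:0" if train_on_gpu else "cpu")

    def set_parameter_requires_grad(model, feature_extracting):
        """Freeze all existing parameters when only the new output layer should be trained."""
        if feature_extracting:
            for param in model.parameters():
                param.requires_grad = False

The last layer of the pretrained ResNet is a fully connected layer with 2048 inputs and 1000 outputs, that is, a 1000-class classifier. We need to adapt it to our task by changing the 1000 outputs to 102.

6. Initialize the model architecture

The steps are as follows:

    Take a trained model and set pretrained = True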
    to obtain the weights someone else has already trained.
    Optionally freeze some layers; the layers to freeze are marked by setting requires_grad to False.
    Whether the task is classification or regression, replace the final FC layer with one that fits the task.

Official documentation: https://pytorch.org/vision/stable/models.html

    # load someone else's model
    def initialize_model(model_name, num_classes, feature_extract, use_pretrained=True):
        # pick the model; each architecture needs its output layer replaced differently
        model_ft = None
        input_size = 0

        if model_name == "resnet":
            """ Resnet152 """
            # 1. load the pretrained network
            model_ft = models.resnet152(pretrained=use_pretrained)
            # 2. optionally freeze the feature extractor and train only the FC layer
            set_parameter_requires_grad(model_ft, feature_extract)
            # 3. number of input features of the fully connected layer
            num_frts = model_ft.fc.in_features
            # 4. replace the fully connected layer so that it outputs num_classes values;
            #    dim=1 applies log-softmax across the classes of each sample,
            #    so the exponentiated outputs of each row sum to 1
            model_ft.fc = nn.Sequential(nn.Linear(num_frts, num_classes),
                                        nn.LogSoftmax(dim=1))
            input_size = 224

        elif model_name == "alexnet":
            """ Alexnet """
            model_ft = models.alexnet(pretrained=use_pretrained)
            set_parameter_requires_grad(model_ft, feature_extract)
            # replace the last layer of the classifier, at index 6
            num_frts = model_ft.classifier[6].in_features
            model_ft.classifier[6] = nn.Linear(num_frts, num_classes)
            input_size = 224

        elif model_name == "vgg":
            """ VGG16 """
            model_ft = models.vgg16(pretrained=use_pretrained)
            set_parameter_requires_grad(model_ft, feature_extract)
            num_frts = model_ft.classifier[6].in_features
            model_ft.classifier[6] = nn.Linear(num_frts, num_classes)
            input_size = 224

        elif model_name == "squeezenet":
            """ Squeezenet """
            model_ft = models.squeezenet1_0(pretrained=use_pretrained)
            set_parameter_requires_grad(model_ft, feature_extract)
            model_ft.classifier[1] = nn.Conv2d(512, num_classes, kernel_size=(1, 1), stride=(1, 1))
            model_ft.num_classes = num_classes
            input_size = 224

        elif model_name == "densenet":
            """ Densenet """
            model_ft = models.densenet121(pretrained=use_pretrained)
            set_parameter_requires_grad(model_ft, feature_extract)
            num_frts = model_ft.classifier.in_features
            model_ft.classifier = nn.Linear(num_frts, num_classes)
            input_size = 224

        elif model_name == "inception":
            """ Inception V3 """
            model_ft = models.inception_v3(pretrained=use_pretrained)
            set_parameter_requires_grad(model_ft, feature_extract)
            # auxiliary classifier
            num_frts = model_ft.AuxLogits.fc.in_features
            model_ft.AuxLogits.fc = nn.Linear(num_frts, num_classes)
            # main classifier
            num_frts = model_ft.fc.in_features
            model_ft.fc = nn.Linear(num_frts, num_classes)
            input_size = 299

        else:
            raise ValueError("Invalid model name: {}".format(model_name))

        return model_ft, input_size

7. Set the parameters to train

    # set the model name and the number of output classes
    model_ft, input_size = initialize_model(model_name, 102, feature_extract, use_pretrained=True)

    # compute on the GPU if available
    model_ft = model_ft.to(device)

    # checkpoint file: stores the trained model so it can be loaded directly later
    filename = 'checkpoint.pth'

    # whether to train all layers
    params_to_update = model_ft.parameters()
    # print the layers that will be trained
    print("Params to learn:")
    if feature_extract:
        params_to_update = []
        for name, param in model_ft.named_parameters():
            if param.requires_grad:
                params_to_update.append(param)
                print("\t", name)
    else:
        for name, param in model_ft.named_parameters():
            if param.requires_grad:
                print("\t", name)

    Params to learn:
         fc.0.weight
         fc.0.bias

7. Training and prediction

7.1 Optimizer settings

    # optimizer
    optimizer_ft = optim.Adam(params_to_update, lr=1e-2)
    # learning-rate schedule: multiply the learning rate by 0.1 every 7 epochs
    scheduler = optim.lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)
    # the last layer is already LogSoftmax, so use NLLLoss;
    # nn.CrossEntropyLoss would apply log-softmax a second time
    criterion = nn.NLLLoss()

    # training function
    # is_inception: whether the model is Inception, which also returns auxiliary outputs while training
    def train_model(model, dataloaders, criterion, optimizer, num_epochs=10, is_inception=False, filename=filename):
        since = time.time()
        # best validation accuracy so far
        best_acc = 0
        # to resume from a saved checkpoint, uncomment:
        # checkpoint = torch.load(filename)
        # best_acc = checkpoint['best_acc']
        # model.load_state_dict(checkpoint['state_dict'])
        # optimizer.load_state_dict(checkpoint['optimizer'])
        # model.class_to_idx = checkpoint['mapping']

        # run on the GPU or the CPU
        model.to(device)

        # histories kept for plotting later
        val_acc_history = []
        train_acc_history = []
        train_losses = []
        valid_losses = []
        LRs = [optimizer.param_groups[0]['lr']]

        # keep a copy of the best weights
        best_model_wts = copy.deepcopy(model.state_dict())

        for epoch in range(num_epochs):
            print('Epoch {}/{}'.format(epoch, num_epochs - 1))
            print('-' * 10)

            # each epoch has a training phase and a validation phase
            for phase in ['train', 'valid']:
                if phase == 'train':
                    model.train()  # training mode
                else:
                    model.eval()   # evaluation mode

                running_loss = 0.0
                running_corrects = 0

                # iterate over all the data
                for inputs, labels in dataloaders[phase]:
                    # move inputs and labels to the device
                    inputs = inputs.to(device)
                    labels = labels.to(device)

                    # reset the gradients
                    optimizer.zero_grad()
                    # compute gradients only during training
                    with torch.set_grad_enabled(phase == 'train'):
                        # Inception branch; not used for ResNet
                        if is_inception and phase == 'train':
                            outputs, aux_outputs = model(inputs)
                            loss1 = criterion(outputs, labels)
                            loss2 = criterion(aux_outputs, labels)
                            loss = loss1 + 0.4 * loss2
                        else:  # ResNet takes this branch
                            outputs = model(inputs)
                            loss = criterion(outputs, labels)

                        # index of the most probable class
                        _, preds = torch.max(outputs, 1)

                        # update the weights only in the training phase
                        if phase == 'train':
                            loss.backward()
                            optimizer.step()

                    # accumulate loss and number of correct predictions
                    running_loss += loss.item() * inputs.size(0)
                    running_corrects += torch.sum(preds == labels.data)

                # report
                epoch_loss = running_loss / len(dataloaders[phase].dataset)
                epoch_acc = running_corrects.double() / len(dataloaders[phase].dataset)

                time_elapsed = time.time() - since
                print('Time elapsed {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))
                print('{} Loss: {:.4f} Acc: {:.4f}'.format(phase, epoch_loss, epoch_acc))

                # keep the best model
                if phase == 'valid' and epoch_acc > best_acc:
                    best_acc = epoch_acc
                    best_model_wts = copy.deepcopy(model.state_dict())
                    state = {
                        # state_dict holds the learnable weights and biases
                        'state_dict': model.state_dict(),
                        'best_acc': best_acc,
                        'optimizer': optimizer.state_dict(),
                    }
                    torch.save(state, filename)
                if phase == 'valid':
                    val_acc_history.append(epoch_acc)
                    valid_losses.append(epoch_loss)
                    # StepLR.step() takes no loss value. Passing epoch_loss, as in the run
                    # logged in 7.2, is read as an epoch index and makes the learning rate
                    # jump between 1e-2, 1e-3 and 1e-4 depending on the loss.
                    scheduler.step()
                if phase == 'train':
                    train_acc_history.append(epoch_acc)
                    train_losses.append(epoch_loss)

            print('Optimizer learning rate : {:.7f}'.format(optimizer.param_groups[0]['lr']))
            LRs.append(optimizer.param_groups[0]['lr'])
            print()

        time_elapsed = time.time() - since
        print('Training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))
        print('Best val Acc: {:4f}'.format(best_acc))

        # after training, use the best weights as the final model
        model.load_state_dict(best_model_wts)
        return model, val_acc_history, train_acc_history, valid_losses, train_losses, LRs

7.2 Start training the model

I only trained for 5 epochs here (training really takes a long time); increase the number of epochs when you try it yourself.

    # if training is too slow, lower the number of epochs; around 50 iterations may give better results
    # while training, check whether the loss falls and the accuracy rises;
    # a large gap between training and validation means overfitting
    model_ft, val_acc_history, train_acc_history, valid_losses, train_losses, LRs = train_model(
        model_ft, dataloaders, criterion, optimizer_ft, num_epochs=5,
        is_inception=(model_name == "inception"))

    Epoch 0/4
    ----------
    Time elapsed 29m 41s
    train Loss: 10.4774 Acc: 0.3147
    Time elapsed 32m 54s
    valid Loss: 8.2902 Acc: 0.4719
    Optimizer learning rate : 0.0010000

    Epoch 1/4
    ----------
    Time elapsed 60m 11s
    train Loss: 2.3126 Acc: 0.7053
    Time elapsed 63m 16s
    valid Loss: 3.2325 Acc: 0.6626
    Optimizer learning rate : 0.0100000

    Epoch 2/4
    ----------
    Time elapsed 90m 58s
    train Loss: 9.9720 Acc: 0.4734
    Time elapsed 94m 4s
    valid Loss: 14.0426 Acc: 0.4413
    Optimizer learning rate : 0.0001000

    Epoch 3/4
    ----------
    Time elapsed 132m 49s
    train Loss: 5.4290 Acc: 0.6548
    Time elapsed 138m 49s
    valid Loss: 6.4208 Acc: 0.6027
    Optimizer learning rate : 0.0100000

    Epoch 4/4
    ----------
    Time elapsed 195m 56s
    train Loss: 8.8911 Acc: 0.5519
    Time elapsed 199m 16s
    valid Loss: 13.2221 Acc: 0.4914
    Optimizer learning rate : 0.0010000

    Training complete in 199m 16s
    Best val Acc: 0.662592

7.3 Train all layers

    # unfreeze the whole network
    for param in model_ft.parameters():
        param.requires_grad = True

    # continue training all parameters with a smaller learning rate;
    # params_to_update only holds the fc parameters, so pass model_ft.parameters()
    optimizer = optim.Adam(model_ft.parameters(), lr=1e-4)
    scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)

    # loss function
    criterion = nn.NLLLoss()

    # load the saved weights and continue training from them;
    # filename points to the best checkpoint from the previous stage
    checkpoint = torch.load(filename)
    best_acc = checkpoint['best_acc']
    model_ft.load_state_dict(checkpoint['state_dict'])
    # the saved optimizer state covers only the fc parameters and would restore the
    # old learning rate, so it is not loaded into the new optimizer

Start training

Note: training becomes very slow at this stage. My GPU is a 1660 Ti, for reference.

    model_ft, val_acc_history, train_acc_history, valid_losses, train_losses, LRs = train_model(
        model_ft, dataloaders, criterion, optimizer, num_epochs=2,
        is_inception=(model_name == "inception"))

    Epoch 0/1
    ----------
    Time elapsed 35m 22s
    train Loss: 1.7636 Acc: 0.7346
    Time elapsed 38m 42s
    valid Loss: 3.6377 Acc: 0.6455
    Optimizer learning rate : 0.0010000

    Epoch 1/1
    ----------
    Time elapsed 82m 59s
    train Loss: 1.7543 Acc: 0.7340
    Time elapsed 86m 11s
    valid Loss: 3.8275 Acc: 0.6137
    Optimizer learning rate : 0.0010000

    Training complete in 86m 11s
    Best val Acc: 0.645477

In the logged run the optimizer state saved during the first stage was reloaded, which is why the printed learning rate is 0.001 rather than 1e-4.

8. Load the trained model

This amounts to a simple forward pass (inference); no parameters are updated.

    model_ft, input_size = initialize_model(model_name, 102, feature_extract, use_pretrained=True)

    # move the model to the GPU if available
    model_ft = model_ft.to(device)

    # name of the saved checkpoint
    filename = 'checkpoint.pth'

    # load the model weights
    checkpoint = torch.load(filename)
    best_acc = checkpoint['best_acc']
    model_ft.load_state_dict(checkpoint['state_dict'])

    def process_image(image_path):
        # read one test image
        img = Image.open(image_path)
        # thumbnail() only shrinks, keeps the aspect ratio, modifies the image in place
        # and returns None, whereas resize() sets the exact output size.
        # Here the shorter side is reduced to 256.
        if img.size[0] > img.size[1]:
            img.thumbnail((10000, 256))
        else:
            img.thumbnail((256, 10000))
        # crop the central 224 x 224 region
        left = (img.width - 224) // 2
        upper = (img.height - 224) // 2
        right = left + 224
        lower = upper + 224
        img = img.crop((left, upper, right, lower))
        # same preprocessing as for validation: scale to [0, 1], then normalize
        img = np.array(img) / 255
        mean = np.array([0.485, 0.456, 0.406])
        std = np.array([0.229, 0.224, 0.225])
        img = (img - mean) / std
        # move the color channel to the front: (h, w, c) -> (c, h, w)
        img = img.transpose((2, 0, 1))
        return img

    def imshow(image, ax=None, title=None):
        """Display a preprocessed image."""
        if ax is None:
            fig, ax = plt.subplots()
        # move the color channel back: (c, h, w) -> (h, w, c)
        image = np.array(image).transpose((1, 2, 0))
        # undo the normalization
        mean = np.array([0.485, 0.456, 0.406])
        std = np.array([0.229, 0.224, 0.225])
        image = std * image + mean
        image = np.clip(image, 0, 1)
        ax.imshow(image)
        ax.set_title(title)
        return ax

    image_path = r'./flower_data/valid/3/image_06621.jpg'
    img = process_image(image_path)  # the function can be reused for any number of images
    imshow(img)
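To connect the single preprocessed image with the loaded model, here is a minimal sketch of a top-k prediction for one image. It assumes the network ends in LogSoftmax, as the ResNet branch of initialize_model does, so the outputs are exponentiated to get probabilities. The function predict_topk and the stand-in names toy_model, toy_names and toy_img are hypothetical and exist only so that the sketch runs on its own; in the article's setting one would pass model_ft, the array returned by process_image, and class_names.

```python
import numpy as np
import torch
from torch import nn


def predict_topk(model, img_chw, class_names, k=5):
    """Return the k most likely (class_name, probability) pairs for one image.

    img_chw: normalized float array of shape (3, H, W)
    model:   network whose last layer is LogSoftmax (assumption)
    """
    model.eval()
    device = next(model.parameters()).device
    # add a batch axis: (3, H, W) -> (1, 3, H, W)
    x = torch.from_numpy(np.asarray(img_chw)).float().unsqueeze(0).to(device)
    with torch.no_grad():
        log_probs = model(x)
    probs = log_probs.exp().squeeze(0)  # log-probabilities -> probabilities
    top_p, top_i = probs.topk(k)
    return [(class_names[i], p) for p, i in zip(top_p.tolist(), top_i.tolist())]


# Hypothetical stand-ins: an untrained toy model and a random "image".
torch.manual_seed(0)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 102), nn.LogSoftmax(dim=1))
toy_names = sorted(str(i) for i in range(1, 103))
toy_img = np.random.rand(3, 224, 224)

top5 = predict_topk(toy_model, toy_img, toy_names)
for name, p in top5:
    print(name, round(p, 4))
```

The returned names are folder names such as '21'; looking them up in cat_to_name gives the flower name.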
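Checking img.shape gives (3, 224, 224): the channel axis has moved to the front and the image size is unchanged.

9. Inference

    img.shape

    # get one batch of validation data
    dataiter = iter(dataloaders['valid'])
    images, labels = next(dataiter)

    model_ft.eval()

    if train_on_gpu:
        # one forward pass produces the output
        output = model_ft(images.cuda())
    else:
        output = model_ft(images)

    # the batch holds 8 samples with 102 values each;
    # each value is the log-probability of one class
    output.shape

    torch.Size([8, 102])

9.1 Compute the most probable class

    _, preds_tensor = torch.max(output, 1)
    # move to the CPU if needed and squeeze into a 1-D array of predicted class indices
    preds = np.squeeze(preds_tensor.numpy()) if not train_on_gpu else np.squeeze(preds_tensor.cpu().numpy())

9.2 Show the prediction results

    fig = plt.figure(figsize=(20, 20))
    columns = 4
    rows = 2

    for idx in range(columns * rows):
        ax = fig.add_subplot(rows, columns, idx + 1, xticks=[], yticks=[])
        plt.imshow(im_convert(images[idx]))
        # class indices are positions in class_names (the folder names), not JSON keys,
        # so go through class_names before looking up the flower name
        pred_name = cat_to_name[class_names[preds[idx]]]
        true_name = cat_to_name[class_names[labels[idx].item()]]
        ax.set_title("{} ({})".format(pred_name, true_name),
                     color=("green" if pred_name == true_name else "red"))
    plt.show()
    # green titles mark correct predictions, red titles mark wrong ones

Closing words

If you have read this far, please give the post a like; your support is the author's biggest motivation to keep writing!

Reference course: 五大神经网络 (Five Major Neural Networks)

My knowledge is limited; if you find any mistakes, corrections are welcome.

This article is intended for learning and exchange only, not for any commercial use. If there is any copyright issue, please contact the author promptly.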
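One detail behind the name lookups in sections 4 and 9.2 is easy to miss: ImageFolder assigns class indices by sorting the folder names as strings, so index 1 is folder '10', not folder '2'. The sketch below illustrates this in plain Python, assuming the folders are named '1' to '102' as in the folder structure described at the start. The helper index_to_flower is hypothetical, and the small dictionary holds a few entries copied from the cat_to_name output shown in section 3.

```python
# Folder names as they appear under flower_data/train (assumed '1' ... '102').
folder_names = [str(i) for i in range(1, 103)]

# ImageFolder orders its classes by sorting the folder names as strings.
class_names = sorted(folder_names)

# A few entries copied from the cat_to_name output shown earlier.
cat_to_name = {
    '1': 'pink primrose',
    '10': 'globe thistle',
    '100': 'blanket flower',
    '2': 'hard-leaved pocket orchid',
}


def index_to_flower(idx):
    """Map a predicted class index to a flower name via the folder name."""
    return cat_to_name[class_names[idx]]


for idx in range(3):
    print(idx, class_names[idx], index_to_flower(idx))
# 0 1 pink primrose
# 1 10 globe thistle
# 2 100 blanket flower
```

This is why a predicted index should go through class_names before being used as a key in cat_to_name.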